Robustness in speech quality assessment and temporal training expiry in mobile crowdsourcing environments

Authors

  • Tim Polzehl
  • Babak Naderi
  • Friedemann Köster
  • Sebastian Möller
Abstract

Following up on prior work on speech quality assessment in laboratory environments, this paper introduces two recently released mobile crowdsourcing paradigms. In contrast to web-based crowdsourcing, mobile crowdsourcing is carried out on smartphones or tablets in the field. First, because the hardware involved, such as headphones, cannot be known in this paradigm, we focus on the effect of mobile crowdsourcing on speech quality assessment, using the quality degradation types described for the model in ITU-T Rec. P.863. As a result, indicators for degradation types that can be reliably assessed in mobile crowdsourcing paradigms are presented for the first time. This reliability is interpreted as robustness towards crowdsourcing assessment environments. Second, because working times, pauses and work fragmentation cannot be controlled, we introduce and analyze temporally expiring training certificates as qualifications. Accordingly, we design our study to automatically issue re-training job instances upon timeout, aiming to re-condition distracted or oblivious crowd workers. Results indicate a clear improvement in correlation with laboratory test results when the proposed training expiry is applied. Finally, the indicators presented contribute to preliminary guidelines on the practical execution of quality assessment using mobile crowdsourcing.
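
The training-expiry idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names Worker, needs_retraining and next_job are hypothetical, and the 60-minute lifetime is an assumed placeholder, since the abstract does not state the actual timeout. The sketch only shows the general pattern of routing a worker to a re-training job once the training certificate has expired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed validity window; the abstract does not give the real timeout value.
CERTIFICATE_LIFETIME = timedelta(minutes=60)


@dataclass
class Worker:
    """Hypothetical crowd worker record."""
    worker_id: str
    last_trained_at: Optional[datetime] = None  # None: never completed training


def needs_retraining(worker: Worker, now: datetime,
                     lifetime: timedelta = CERTIFICATE_LIFETIME) -> bool:
    """Return True if the worker has no certificate or it has expired."""
    if worker.last_trained_at is None:
        return True
    return now - worker.last_trained_at > lifetime


def next_job(worker: Worker, now: datetime) -> str:
    """Issue a training job on expiry, otherwise a regular rating job."""
    return "training" if needs_retraining(worker, now) else "rating"
```

A worker who trained 30 minutes ago would receive a rating job under these assumptions, while one who trained more than 60 minutes ago, or never, would first receive a training job.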

Similar resources

Controlling quality and handling fraud in large scale crowdsourcing speech data collections

This paper presents strategies for measuring and assuring high quality when performing large-scale crowdsourcing data collections for acoustic model training. We examine different types of spam encountered while collecting and validating speech audio from unmanaged crowds and describe how we were able to identify these sources of spam and prevent our data from being tainted. We built a custom A...

Speech Recognition in Unknown Noisy Conditions

This chapter describes our recent advances in automatic speech recognition, with a focus on improving the robustness against environmental noise. In particular, we investigate a new approach for performing recognition using noisy speech samples without assuming prior information about the noise. The research is motivated in part by the increasing deployment of speech recognition technologies on...

How to Design Mobile Crowdsourcing Better? Leveraging Data Integration in Prototype Testing

Mobile crowdsourcing applications often run in dynamic environments. Due to limited time and budget, developers of mobile crowdsourcing applications usually cannot completely test their prototypes in real world situations. We describe a data integration technique for developers to validate their design in prototype testing. Our approach constructs the intended context by combining real-time, hi...

Robust Speech Features and Acoustic Models for Speech Recognition

This thesis examines techniques to improve the robustness of automatic speech recognition (ASR) systems against noise distortions. The study is important as the performance of ASR systems degrades dramatically in adverse environments, and hence greatly limits the speech recognition application deployment in realistic environments. Towards this end, we examine a feature compensation approach and...

Using crowdsourcing for labelling emotional speech assets

The success of supervised learning approaches for the classification of emotion in speech depends highly on the quality of the training data. The manual annotation of emotion speech assets is the primary way of gathering training data for emotional speech recognition. This position paper proposes the use of crowdsourcing for the rating of emotion speech assets. Recent developments in learning f...

Publication date: 2015